Theoretical and Empirical Analysis of a Spatial EA Parallel Boosting Algorithm

Authors

  • Uday Kamath
  • Carlotta Domeniconi
  • Kenneth A. De Jong
Abstract

Many real-world problems involve massive amounts of data. Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue to be addressed. A common approach is to perform sampling to reduce the size of the dataset and enable efficient learning. Alternatively, one customizes learning algorithms to achieve scalability. In either case, the key challenge is to obtain algorithmic efficiency without compromising the quality of the results. In this article we discuss a meta-learning algorithm (PSBML) that combines concepts from spatially structured evolutionary algorithms (SSEAs) with concepts from ensemble and boosting methodologies to achieve the desired scalability property. We present both theoretical and empirical analyses which show that PSBML preserves a critical property of boosting, specifically, convergence to a distribution centered around the margin. We then present additional empirical analyses showing that this meta-level algorithm provides a general and effective framework that can be used in combination with a variety of learning classifiers. We perform extensive experiments to investigate the trade-off achieved between scalability and accuracy, and robustness to noise, on both synthetic and real-world data. These empirical results corroborate our theoretical analysis, and demonstrate the potential of PSBML in achieving scalability without sacrificing accuracy.

Similar resources

Theoretical and Empirical Analysis of a Parallel Boosting Algorithm

Many real-world problems involve massive amounts of data. Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue to be addressed. A common approach is to perform sampling to reduce the size of the dataset and enable efficient learning. Alternatively, one customizes learning algorithms to achieve scalability. In either case, the ke...

A Combined group EA-PROMETHEE method for a supplier selection problem

One of the important decisions which impacts all firms’ activities is the supplier selection problem. Since the 1950s, several works have addressed this problem by treating different aspects and instances. In this paper, a combined multiple criteria decision making (MCDM) technique (EA-PROMETHEE) has been applied to implement a proper decision making. To this aim, after reviewing the theoretica...

Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers

This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most usable kernel methods, along with support vector machine classifier, with high accuracy in object recognition. MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...

Comparative Analysis of the Theoretical and Empirical of Industrial Revolution in the World in Order to Determine the Characteristics of Desirable Industrial Development

The purpose of this paper is to provide a comparative analysis of the theoretical and empirical industrial revolution in the world in order to determine the characteristics of desirable industrial development. In this regard, using a qualitative research method of historical type and library study, the empirical developments related to industry and industrial development were first reviewed and...

Optimization of Agricultural BMPs Using a Parallel Computing Based Multi-Objective Optimization Algorithm

Beneficial Management Practices (BMPs) are important measures for reducing agricultural non-point source (NPS) pollution. However, selection of BMPs for placement in a watershed requires optimizing available resources to maximize possible water quality benefits. Due to its iterative nature, the optimization typically takes a long time to achieve the BMP trade-off results which is not desirable ...

Journal:
  • Evolutionary Computation

Volume 26, Issue 1

Pages: -

Publication date: 2018